Author name disambiguation: What difference does it make in author-based citation analysis?
نویسندگان
چکیده
In this paper, we explore how strongly author name disambiguation (AND) affects the results of an author-based citation analysis study, and identify conditions under which the commonly used simplified approach of using surnames and first initials may suffice in practice. We compare author citation ranking and co-citation mapping results in the stem cell research field 2004-2009 between two AND approaches: the traditional simplified approach of using author surnames and first initials, and a sophisticated algorithmic approach. We find that the traditional approach leads to extremely distorted rankings and substantially distorted mappings of authors in this field when based on firstor all-author citation counting, whereas last-author based citation ranking and co-citation mapping both appear relatively immune to the author name ambiguity problem. This is largely because romanized names of Chinese and Korean authors, who are very active in this field, are extremely ambiguous, but few of these researchers consistently publish as last authors in by-lines. We conclude that more earnest effort is required to deal with the author name ambiguity problem in both citation analysis and information retrieval, especially given the current trend towards globalization. In the stem cell field, where lab heads are traditionally listed as last authors in by-lines, last-author based citation ranking and co-citation mapping using the traditional simple approach to author name disambiguation may serve as a simple workaround, but likely at the price of largely filtering out Chinese and Korean contributions to the field as well as important contributions by young researchers.
منابع مشابه
بهبود صحت ابهامزدایی نام نویسنده با استفاده از خوشهبندی تجمّعی
Today, digital libraries are important academic resources including millions of citations and bibliographic essential information such as titles, author's names and location of publications. From the view of knowledge accumulation management, the ability to search fast, accurate, desired contents, has a great importance. The complexity and similarity in these resources cause many challenges and...
متن کاملA Real-time Heuristic-based Unsupervised Method for Name Disambiguation in Digital Libraries
This paper addresses the problem of name disambiguation in the context of digital libraries that administer bibliographic citations. The problem occurs when multiple authors share a common name or when multiple name variations for an author appear in citation records. Name disambiguation is not a trivial task, and most digital libraries do not provide an efficient way to accurately identify the...
متن کامل"I Cannot Tell What the Dickens His Name Is": Name Disambiguation in Institutional Repositories
INTRODUCTION Authors who publish under more than one form of their name, multiple authors with the same name, and incomplete author information can all create challenges for repository staff when entering metadata. Unless properly addressed, these variations and duplications can result in search and retrieval errors for users. Name disambiguation, the process of identifying, merging, and making...
متن کاملReducing Fragmentation in Incremental Author Name Disambiguation
Author name ambiguity is a hard problem that occurs when several authors publish articles with the same name or when a same author publishes their articles under different names. Traditionally, automatic disambiguation methods process the author names of all citation records in a repository. Aiming efficiency, incremental methods disambiguate author names only when new citation records are inse...
متن کاملA tool for generating synthetic authorship records for evaluating author name disambiguation methods
0020-0255/$ see front matter 2012 Elsevier Inc http://dx.doi.org/10.1016/j.ins.2012.04.022 ⇑ Corresponding author at: Departamento de Ciên E-mail addresses: [email protected] (A.A. F dcc.ufmg.br (A.H.F. Laender), [email protected] 1 Here regarded as a set of bibliographic informati particular article. The author name disambiguation task has to deal with uncertainties related to the possib...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- JASIST
دوره 63 شماره
صفحات -
تاریخ انتشار 2012